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Abstract 

Recovering a stochastic process from noisy ensembles of single par¬ 
ticle trajectories (SPTs) is resolved here using the Langevin equation 
as a model. The massive redundancy contained in SPTs data allows 
recovering local parameters of the underlying physical model. We use 
several parametric and non-parametric estimators to compute the first 
and second moment of the process and to recover the local drift, its 
derivative and the diffusion tensor. Using a local asymptotic expansion 
of the estimators and computing the empirical transition probability 
function, we develop here a method to deconvolve the instrumental 
from the physical noise. We use numerical simulations to explore 
the range of validity for the estimators. The present analysis allows 
characterizing what can exactly be recovered from the statistics of 
super-resolution microscopy trajectories used in molecular trafficking 
and underlying cellular function. 


1 Introduction 

The redundancy of many short single particle trajectories (SPTs) are nec¬ 
essary to extract physical parameters from empirical data at a molecular 
level [muiE], while long isolated trajectories have been used to extract sec¬ 
ond order properties of a Brownian motion using mean-square displacement 
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analysis [3111] [SlIS]. Some geometrical properties can also be recovered from 
long trajectories, snch as the radius of confinement for a confined Brownian 
motion [7]. In the context of cellular transport (ameoboid), high resolution 
motion analysis of long SPTs [10] in micro-fluidic chambers containing ob¬ 
stacles revealed different type of cell motions. Depending on the obstacle 
density, crawling was found at low density of obstacles m and directed and 
random phases can even be differentiated. In high density regions, the motion 
is rather directed from obstacle-to-obstacle [I2|. 

Under additional assumptions about the physical process and with the 
advent of massive high resolution microscopy data, it has been recently pos¬ 
sible to recover additional features from many short trajectories such as the 
local drift, the diffusion tensor and even potential wells that are rehned local 
structures, generating conhnement due to a direct field of forces Emi. 
Moreover, with a model of obstacles organization, the map of diffusion co¬ 
efficient can be converted into a density of obstacles [9]. Several approaches 
have been proposed to reconstruct diffusion properties from empirical esti¬ 
mators of a large ensemble of single noisy trajectories [131 [S], even when 
trajectories are sampled and recorded points contain additional noise due to 
background limitations [15] . Precise and careful estimates [I311I1] have been 
obtained for pure diffusion processes (no drift). 

In this article, we present a general analysis of short stochastic trajectories 
that can be driven by a non-constant drift. The drift analysis is relevant when 
a tracked particle experiences direct interactions or becomes conhned by a 
potential well, that needs to be resolved and parameters are extracted from 
data. Because empirical data can be potentially noisy, the drift term can be 
affected by noise in the measurement, such as tracking noise, thus requiring 
a careful interpretation of the data analysis. As we shall see here, when 
a stochastic particle crosses a potential well, the second derivative of the 
potential well is an additional term that contributes to the expression of the 
measured diffusion coefficient. Thus, a deconvolution of the trajectories is 
needed to remove instrumental noise or tracking error that affect the recovery 
of the physical from the measured trajectory. 

Deriving analytical formula allows resolving precisely the contribution of 
each term and recovering the physical dynamics by computing the hrst and 
second moments from data. The empirical data are presented as a collection 
of discrete trajectories obtained at a fixed time resolution At. We present 
both parametric and non-parametric estimators and the underlying physical 
model here is the Smoluchowski limit of Langevin’s equation. In addition 
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to several estimators and their analysis, numerical simulations are used to 
explore the range of validity of the estimators. This article is organized as 
follows: the hrst part is dedicated to construct non-parametric empirical es¬ 
timators from a stochastic analysis in the entire space d = Second, 

we derive analytical formula for the hrst and the second moment. In the third 
part, we study the parametric estimators for an Ornstein-Uhlenbeck process 
and obtain specihc estimates. The analytical formula for the estimators are 
compared to numerical simulations. We conclude that this analysis supports 
that biophysical properties of a membrane can be recovered from the empir¬ 
ical estimators of many SPTs and potential wells are physical objects DEI 
and not a hction of tracking algorithms. 


2 Estimations of a stochastic process using a 
non-parametric estimators 

2.1 Stochastic Model 

The physical motion of a stochastic particle is modeled by the Smoluchowski’s 
limit of the Langevin equation resulting in the equation of motion 

X = a{X) + b{X)w, (1) 


where a is a deterministic drift, b the diffusion tensor and w the classical 
Wiener (5-correlated noise. The Ito’s integral leads to 


X{t) 


X{u)+ / a(X(s))(is-1- / b{X{s))dWs 


( 2 ) 


and at times 0, At ,..., nAt, 


p{n-\-l)At 

J nAt 

^{n-\-l)At 


a{X{s))ds = a(Xn)At + o{At) 


' nAt 


b{X{s))dWs = b{Xn)Aw, 

the discrete approximation sequence is 

Xn+l = Xn + a[Xn)At b[Xn)Aw, 


(3) 

(4) 

(5) 
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where = X{nAt). 

We assume that the physical motion is sampled at points at each 
time step, an additive Gaussian noise is added and the observed motion is 
described by 

Yn = Xn + Zn, wliere = at]^ ( 6 ) 

and 77 ^ is a n-dimensional Gaussian variable. We shall present various sta¬ 
tistical parametric and non-parametric approaches to recover the underlying 
stochastic component of the continuous variable X from the empirical mea¬ 
sured sequence Yn- 


2.2 Recovering the empirical transition probability den¬ 
sity function in R 

We compute here the transition probability p{y,n -|- l|a;,n) = Pr{Yn+i = 
y\Yn = x} in one dimension when the diffusion tensor b{Xn) = \/2D is 
uniform: 


p{Yn+l = y\Yn = x) = p{Xn+l + Zn+l= y\Xn + Zn = x). (7) 

The two processes and Zn are independent and in R, we have 

p{Yn+l = y\Yr, = x) = / p{Xn+l + Zn+l = |/|X„ = Xi)p{Zn = X- Xi)dXi 




p{Xn+i = yi, Zn +1 = y - yi\Xn = xi)p{Zn = x - Xi)dxidyi 


p{Xn+i = yi\Xn = xi)p{Zn+i = y- yi)p{Zn = x- xi)dxidyi 


{x-xif {y-yif 



e 2ct 2 e 2 (t2 

p{Xn+i = yi\Xn = xi). -=- j= — dxidyi. 


a 




( 8 ) 


For At <C 1 and X„,+i — X^ ~ A/”(a(A„)Af, y2DAi), the pdf is 

idJi - xi - a{xi)At)‘^ 
e 


p{Xn+l = yi\Xn = Xi) = 


ADAt 


^/AnDAt 


(9) 
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which gives that 


piYn+i = y\Yr, = x) = 



{yi -xi- a{xi)Aty {x -xiY {y - yi)^ 

e 2(t 2 e 


WAt 


2a2 


\/4:TrDAt 


a 




a 




(x-xi)2 {y - Xi - a{xi)Atf 
^^2 g 2(a2 + 2DAt) 

+ 2D At) 


dxi- 


a 


To obtain an explicit expression of this convolution, we use the change of 
variable Xi = x + ay where a -C 1, 

{y — X — ay — a{x + ay)AtY 
e"Y e” 2(a2 + 2D At) 


p{Yn+i = y\Yn = x) = 


-dy. 


^27r(cr^ + 2D At) 

M 

Using a Taylor’s expansion, we have a{x + ay) = a{x) + aya\x) + o(cr) and 


p{Yn+i =y\Yn = x) = 


^2 i^y — X — a(x)At — ay(l + a'(x)At))" 

e~Y e~ 2(a^ + 2DAt) 


\/2^ 


y^27r{a^ + 2D At) 


-dy 


y'^ - 


{y - 


y—x—a{x)At '^2 
a{l+a' {x)At) 




_j. _ o cr^-\-2DAt 

- 2 P ^a^{l+a^ix)At)^ 


a{l + a'{x)At) J / a^+ 2 DAt~ /ttz 

R Y a2(l+a'(x)Ap2 V/VT 


■dy 


y—x—a(x)At \ " 
f7(l+a^(tc)At) ) 


21 + 


a-2+2DAt 

cr^(l+a^(x)A£)^ 


cr(l + a'{x)At) 


{y — X — a{x)At) 


cT^+2DAt 

(T2(i+a'(x)Ap2 
2 


g 2(2a2(l+ a'(x)Af)+2T)Af + 0(Af)2) 
\/^\/2a‘^{l + a'{x)At) + 2D At + 0(Af)^ 


dxidyi 
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We obtain that 


p{Yn+i = y\Yn = x) 


{y — X — a{x)At)'^ 
(TAt{x)V^ 


where 


= 2(T^(1 + a'{x)At) + 2D At + 0(At)^. 


( 10 ) 


( 11 ) 


We conclude that the transition probability of the observed process W is 
Gaussian and W+i — W ~ Ar{a(Yn)At, (Ti(W)). The observed motion is thus 
dehned by the discrete scheme: 

Y^tit + At) = Y^tit) + aobs{yAt)At + ^obs,^At) ^^2) 

where AlW = W{t + At) — W{t) and W is a Brownian motion of variance 1 
and 


aobs{x) = a{x) (13) 

(^obs,Ati^) = (^At{x) = ^/2a‘^{l + a'{x)At) + 2DAt + 0{At)‘^. (14) 

This approach allows dehning the continuous process Y^t from the approxi¬ 
mation at the scale At, it is solution of the stochastic equation 

dY^tis) = a(yAi)ds + (15) 

The drift of the observed process at a time resolution At is the same (at hrst 
order in a) as the physical one, while the diffusion tensor is changed and 
given by formula flTT]) . 


3 Estimating the drift and diffusion tensor 


Optimal estimators of the physical process ([T]) are constructed from Feller’s 
formula [21 [161 El EH] 


a(X) 


lim 

—^0 


E(X(f +At) -X(f) |X(f) 
At 



(16) 
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where the average E(.|X(t) = X) is taken over the trajectories passing 
through point X at time t. Similarly, the second moment is given by 


2b‘j(X) 


, E(X(t + At) - X(t))UX(t + At) - X(t))\ \X(t) 
hni -T- 

At^O At 


X) 


UT) 


In practice, this inversion procedure requires combining several independent 
trajectories passing through each point of a domain. The drift and the dif¬ 
fusion tensor can be recovered from many empirical trajectories. In the next 
section, we generalize these formula to extract the underlying physical pro¬ 
cesses (drift and tensor) from observing a discrete ensemble of trajectories 
Yn at time resolution At. 


3.1 Recovering the drift in dimension 1 

The infinitesimal operator of the observed process Yn defined by eq. (jb]) is 
Gaussian and the associated stochastic discretization equation is flT^ (section 
12.2p . Thus, an estimator for the drift at a time resolution A of the observed 
process is 


aAt{x) 


E 


h^n+l - 

At 


\Yn = X 


— / (l/ - x)p{Yn+i = y\Yn = x)dy 

Jr 

a{x) + o(l). 


(18) 


The average eq. (1T8|) computed from an observed trajectories gives the same 
drift component as the initial physical process in the limit 


lim ttAtix) = a(x). (19) 

At^O 

Thus adding a Gaussian noise on the physical process, sampled at any rate, 
does not alter the physical deterministic drift at first order in a (see appendix 
for the second order). 


7 






3.2 Recovering the diffusion tensor in dimension 1 

The diffusion tensor at position x of the observed trajectories is estimated as 




E 


2At 



^ j (y~ ^fpO^n+i = y\Yn = x)dy 
^ + D + (j‘^a'{x) + + o{At), 


( 20 ) 


where the transition probability of the observed process is computed from 
expression flTOl) . This result shows that at a time resolution At, estimator 
(l20|) contains an additional term a‘^a'{x) to the diffusion coefficient of the 
physical process. In practice, the held a(x) is recovered from estimator (lT8l) . 
and the resolution At is hxed, the amplitude of the noise a is calibrated from 
instrumental noise. It is then possible to recover the diffusion coefficient D. 
A general expression for a diffusion tensor D{x) is derived in appendix B. 

Using formula (l20l) . with 10,000 points, we estimated the diffusion co¬ 
efficient D in Fig. [Th-B. The signal to noise ratio (SNR) is dehned as 
We also estimated the diffusion coefficient for an Ornstein-Uhlenbeck 

A£ 

process (Fig. [Tp-D). These numerical results show that the estimator for 
the diffusion coefficient is biased for a high SNR, due to the approximation 
Xn+i — Xn ~ N'{a{Xn)At, \/2DAt), which is only valid for a small time step 
At. However this approximation is acceptable for a Brownian motion, as 
shown in Fig. [T]A. and[T]B. 


3.3 Other estimators 

For a stochastic process containing a drift component, it is not possible to 
extract the physical diffusion coefficient directly by combining the hrst and 
the second moment estimators, which is in contrast with the pure diffusion 
case (see mm)- We now present an estimator where the Gaussian instru¬ 
mental noise can be eliminated. Using the difference AY^ = Y^+i — Yn, we 
can rewrite 

Al ^n = a{Xn)At + a{Xn)AWn + cr{r]n+l - Vn) 

Al^n -1 = a{Xn-l)At + a{Xn-l)AWn-l+a{r]n-Vn-l), 
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Figure 1: Estimation of the diffusion coefficient for a Brownian Mo¬ 
tion (BM) and Ornstein-Ulhenbeck process (OU). Trajectories of 
Brownian motion (A,B) and an Ornstein-Uhlenbeck process (C,D), were 
simulated using Euler’s scheme. Trajectories were sub-sampled to 10,000 
points in trajectories. A position noise was added to the process. The diffu¬ 
sion coefficient D^t is estimated using formula (160|) . The signal-noise ratio 
(SNR) is defined as The dot line represents D = D^t — ^ and the 

AI 

continuous line represents D ± std. The diffusion coefficient is set to H = 1 
and the drift is a(x) = —2x. (A,C) The positional noise is fixed at a = 0.1, 
while the sampling rate At is varying. (B,D) At = 0.001 and the position 
noise a is varying. 
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where AhF„ and ^Wn-i are two independent increments of Brownian motion. 
The expectation is 


{Yn+i-Yn){Yn-Yn-i) 

At 


Using relation fl20|) . we obtain that 




(r„+i - y„)- 
2At 


\Y„. = X 


+ E 


At 


= D + a^a'{x) + o(l).(22) 


In this estimator, the instrnmental noise is averaged ont. However, there are 
no direct procedures to get rid of the derivative of the drift term, which can 
be extracted from the hrst order moment. However, computing a derivative 
from noisy data should be done carefully as it introduces singularities and 
irregularities. 


3.4 Empirical estimators associated to an Ornstein- 
Uhlenbeck process 

We shall now consider an Ornstein-Uhlenbeck process. 


dX = -X{X - fi)dt + V^dW, 


where the pdf is 


p{y,t\x,0) 


{y- y - {x-y)e 


27rD 


(l g-2At) 


; exp(- 


2D 

X 


(1 — 


(23) 


(24) 


In the discretized setting, the pdf between two time steps separated by an 
interval At associated to the observed motion W, can be computed from eq. 
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([H]) and is given by 


p{Yn+i = y\Yn = x) = 



{x-x,f {y-yif 


3-2AAi'i 


2^2 


2a2 


V'2’'x(l - 


(T\/2^ 

-\At\2 


a 




-dxidyi 


(x-xif (y - /i - (xi - /i)e ) 

-:- 0/'^2 I -D/'l -2AA^^^ 


2a2 


2(a2 + f(l -e-2A^*)) 


-cia;i 


/ + - 2(1 - 
{y — y— {x — 

~2{a\l + + f (1 - e-2^^*)) 

^27r(a2(l + + f (1 - 6-2^^*)) 

The local dynamics can be recovered from the trajectories by computing the 
observed drift at time scale At, which is given by 


(25) 


flAt (^) = ^ “ x)p{Yn+i = y\Yn = x)dy 


= -{x-p) 


1 — e 


-XAt 


At 


(26) 


which generalizes relation fllSp . Similarly, the observed diffusion coefficient 
is 


= —- 
^ ^ 2At 


{y - xfp(Yr,+i = y\Yn = x)dy 


2At 


D 


^2(1 ^ ^ _ g-2AAt) ) + (^ _ ^) 


(l_g-AAt)2 


2At 


(27) 


In Fig. [21 we estimate the local drift and diffusion coefficient for an OU- 
process and compare the local estimators for the drift (IT8|) with relation 
(12^ . For the diffusion tensor, we compare relations fl20p and fl27p . At hrst 
order approximation for short time step At, estimators flTHlh resp. (l2Up j gives 
similar result as fl2^ (resp. (ETP). 
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Figure 2: Estimating the local drift and diffusion coefficient for 
an Ornstein-Uhlenbeck process. (A) trajectory of a one dimensional 
OU process. The OU process is generated using Euler’s scheme (blue curve) 
and the observed trajectory (red curve) is obtained by subsampling at At = 
0.1 and with an additional position noise of standard deviation a = 0.05 
(SNR=40). The other parameters are H = 1, A = 2, /i = 0. The observed 
trajectory contains 10,000 points. (B) Estimation of the local drift using eq. 
fl5^ (black dots), and comparison with the analytical formulas flTSl) (red) 
and fl26|) (blue). (C) Estimation of the local diffusion coefficient using eq. 
fl60|) (black dots), and comparison with the analytical formulas fl20|) (red) 
and fl27)) (blue). 
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3.5 Estimating the motion of an immobile particle and 
criteria of detection 


When a particle is fixed at position Xq, the sampled trajectories are generated 
by the noise localization with variance a. Computing the first moment shows 
that the particle is not moving and the second moment is used to extract the 
variance a. The observed dynamics is given by the stochastic equation 


hji Xq -|- CTTjYii 


where 7]n are i.i.d Gaussian variable of variance 1. The transition probability 
reduces to 

{y - Xof 


p{Yn+l = y\Yn = x)= p{Yn+l = v) = 

and the empirical estimator of the drift is 

1 


2ct2 


a 




1 

At 


{y - x)p{Yn+i = y\Yn = x)dy 
jy-Xof 

[y-x) - 7 =- dy 


a 




which should be compared to relation ffTSj) : the estimator now depends on 
the time resolution At and the location of the pinned particles, which can 
be determined by the empirical averaging - Yk- The sum is converging 
(in probability) as n goes to infinity to the mean lE(Ti) = Xq. Thus contrary 

to the case of a physical drift, the empirical sum T ^ 

— — Wo), which depends on the time step At. 


N 


At 


converges 


Similarly the second moment estimator gives for the diffusion coefficient 
the following expression 

_(^_ 

■(rn+i-W)T 1 I r. ..e 2a2 




DAt{x) = E 


2At 


\Yn = X 


2At 


{y - xY 


a 




1 


2At 


{{x-XYYyo^). 


-dy 

(28) 
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By fixing the center x = Xq, the empirical estimator (125]) allows estimating 
and the variance a. 

This example is instructive because it allows differentiating a fixed particle 
from one trapped in a potential well (see paragraph below). In summary, the 
following criteria can be used: the first moment (velocity) computed from 
a sample trajectory for a fixed particle depends on the time resolution At, 
which is not the case for a physical particle trapped in a potential well (see 
relation (fT8|) i. 
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4 Estimators for a multidimensional diffusion 

• 71 

process m m 


To generalize to higher dimensions the results we derived for dimension one, 
we start with a m-dimensional stochastic equation that represents a physical 
processes, sampled at discrete time steps of length At: 

X„+i = Xn + A{Xn)Xt + \/'2DAw (29) 

where A is a vector held and Aw the classical m—dimensional centered 
Brownian motion of variance 1. The diffusion tensor is assumed to be a 
constant D. As described by eq. ([6]), the observed motion is observed by the 
time sequences 

Yn = Xn + (J'nn, (30) 

where rjn is a m-dimensional standard gaussian. The transition probability 
between points Yn and Yn+i is 


p{Yn+i = y\Yn = x) = 



p(X„+i = yi\Xn = Xi)p{Zn+i = y-y^)x 
p{Zn = X - Xi)dxidy^ 


\X — Xi\ 


\y - yi\ 



e 2a2 e 2^^ 

p{Xn+i = yi\Xn = Xi )——=--^— dxidy^ 


(aV^)™ 


Using the distribution Xn+i — Xn ~ Mm{A{Xn)At, we obtain 

that the transition probability is 




y^-xi- A(a;i)At||^ 
ADAt 

(VAirDAt)^ 


We hrst integrate w.r.t. yi and obtain 


p{Yn+i = y\Yn 


x) 



||a?-a^i||^ \\y - Xi - A{xi)At\f 

■ 2 ^^ 4:DAt + 2 ct 2 

(cr-v/^)™ y^27r{2DAt + a^)”^ 
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Changing variable Xi = x + ar], with a 1, we obtain that 


p{Yn+i = y\Yn = x) = 


\W 

e 2 e 


{y — X — arj — A(x + arj)AtY 
2(a2 + 2D At) 




^27r(2DAt + 




-drj. 


Using a Taylor expansion of the drift at the hrst order, 

A{x + arj) = A{x) + aJ{x)r} + o{a), 
where J{x) is the Jacobian matrix of the vector held A at position x. 

|yy |2 {y - X - A{x)At - a {Im + J(x)At) 77 )^ 

p(Yn+i = y\Yn = x) = J 


2(ct2 + 2DAt) 






dr 


Following the one dimensional step, from a direct integration we obtain 


= y\Yn = x) = 


( 277 )™- det 'S[x) 


--{y - X - AtA{x)fT, \x){y-x 
-.e 2 


where 


E(cc) = + 2DAt)Im + a^B{x)B^{x) (31) 

= (2(7^ + 2DAt)Im + a‘^At{J{x) + j'^{x)) + O(At^) 


and 


B{x) = /^ + AfJ(cc). 


(32) 


Formula (ITID generalizes to the n-dimensional Euclidean space the result of 
section 12.21 for dimension one. 


4.1 Estimation of a drift and diffusion tensor 

To estimate the apparent drift and diffusion tensor, we apply analytical ex¬ 
pressions for the pdf f formula IM]) . Using the formula to characterize the drift 


AtA{x)) 
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at resolution At [TB] at position x, we get 


aAt{x) = E 


Yn+i - 


At 


Y' rt. — X 


At 


{y - x)p{Yn+i = y\Yn = x)dy 


(27r)W2Aty ^detS(aj) 


{y-x)dy --iy-AtA{x) 
e ^ 


xyTi ^{x){y — AtA{x) — x) 


Using the change of variable v = y — AtA{x) — x we obtain 

1 


If 1 

dAtix) = — (v + AtA(x)) , 

At J J (27r)™- det 'S{x] 

= A{x)+o{At). 


-v^Yi ^{x)v 
e 2 ^ ’ dv 


(33) 


This approximation is valid to second order in a (see Appendix B). Similarly 
in the isotropic case, the diffusion coefficient at position x can be recovered 
from the second order moment approximation 


DAt{x) = E 


n+1 n\ 

2mAt 


I "5^ T7, X 


(34) 


Thus, using the pdf formula fl3T|) . we get 
1 


DAtix) = 


2mAt 

1 

1 

2mAt 


{v + AtA{x))^ {v + AtA{x))- 


J (27r)™' det S(a ;) 

im 

J{v^v + At(A(x)^ + A(x)) + At^A{xYA{x)) x 
1 


— v'^Hix) 
e 2 dv 


--v^Yiix) 


[X 


\J (27r)™- det S(c 

^(rr(E(.)) + 0(At^)). 


dv 


(35) 


17 




















Using eq. m, we have 


Tr(S(a;)) = m{2a^ + 2D At) + 2a^AtTr{J{x)) + 0{Af). (36) 


Finally, 


2 2 

= -D H — 7 —I - divi^A) + 0(At), 

At m 


(37) 


where by dehnition, in local coordinates div(A) = > —^ -. In general, 

—^*=1 OXi 

the diffusion tensor can be approximated at order At by 


Dl,{x) = E 


Y n X 


2At 


A+l - yn)Hyn+l - ^n)^' 

2At 

^ {v + AtA{x)y{v + AtA{x)y 


(38) 


(27r)™' det S( 


— v^Hix) 

-.e 2 dv 


X 


+ ^5,, + ^{J{x) + J\x)y^ + 0(At). 
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5 Empirical estimators for a diffusion process 
using a Maximum Likelihood procedure 

We construct now parametric empirical estimators for a stochastic process 
using a Maximum Likelihood procedure. To recover the drift a(6*i,..., 6m) 
and diffusion coefficient we shall estimate the internal parameters di,. ■ ■, 6m- 
We start with a sequence of observed points {yi,. ■ ■, Vn+i), generated by 
a stochastic model 


X = a{x, 6i,6) + b{x)w 


(39) 


perturbed by an additive Gaussian noise, as discussed in the first section. 
To determine the parameters 0 = {6i,. ■ ■ ,6m), we consider the procedure 
which consists in maximizing the transition probability conditioned on the 
sequences {yi,... ,yN+i)- The maximum likelihood estimator is computed 
from the joint probability 


piui, ■ ■ ■, Vn+i, 6i, ■ ■ ■, 6m)- 

Assuming independent and identically distributed sample, we get 

N 


(40) 


P{yi,---,yN+l,6l,...,6m) = Ylp{yn+l\yn,d), 


(41) 


where p{yn+i\yn', 0) is the transition probability from point yn at time to 
yn+i at time tn + At. It is given in dimension one in the entire line when 
b{x) = \/2D by 



iVk+i -yk- ajyk, 0)At)^ 

2cTii(2/fc,0) 



(42) 


as shown in eq. dm 


-yAtivt, 9) = 2(T^(1 + a'(yt, 9) A«) + 2D(yt)At + 0{Atf. 
The log-likelihood is defined as 


(43) 


N 


l{yi,...,yN+i\0) = ^\ogp{yn+i\yn,0) 


(44) 


n=l 


N 


1 


N 


{yn+l -yn- a{yn, 0)At)‘^ 
(^Itiyn) 



n=l 


n=l 
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The parameters 0i,..., and D are computed as maximizer of the likelihood 
function and thus by differentiating i with respect to di,... ,6^, D- The 
conditions ^ = 0 and ^ = 0 can be written as 


di 


= 0 


d(JAt{yn,0) N dcT^tiyn,6) 

E + E 

^ CTAtiVn, 0) ^ 


a{yn]0)Atf. (45) 


When the diffusion coefficient D is independent of the position, the estimator 

is 


N 


D = 


da 


2At \ N 


^{Vn+i -Vn- a{yn] 9)Atf - 2a^{l + 0)At) j + 0(At).(46) 


n=l 


Moreover, differentiation of I w.r.t. 1 < i < m, gives 

dcTAt 


di. _ dOj ^ daAt ^ {yn+i - yn - ajyn] 0)^tf 


de. 


(TAt de, ^ 

At^da{yn;0) a\A+'\ 

~T- —bh iVn+i -Vn- a{yn, 6) At). 

^At 1 

72=1 


(47) 


Conditions (H5l) and ^ = 0 thus lead to the condition on the parameters 

; • • • ; ^772 


E 


da{yn] 0) 

de. 


(l/n+l yn 


aijjn] 6) At) = 0 for i 


1... m. 


(48) 


5.1 Estimating an Ornstein-Uhlenbeck process 

An Ornstein-Uhlenbeck process sampled at short time At (eq. ([5])) is 

Xn+l = Xn — X{Xn — /i)At -|- V2DAw. 

We construct the transition probability of the observed motion as 

PiXn+i =y\Yn = x) = - --, (49) 

aAtV 2 tt 
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where 


'^At = + {2D — 2a‘^\)At + O(At^). 

The log-likelihood fl44p is now 


i{yi,.. .,yN+i\X,D) 


1 ^ 

-iVlogcJAf - 7 ; 


(l/n+l yn 4“ ^{yn 


n=l 


a 


At 


y)Atf 


Conditions ^ = 0, || = 0 and |^ = 0 lead to 

N 

^ ^ yn{yn+l yn T ^{yn 0) 

72=1 


and thus the empirical estimator A for the parameter A is 

^ Xjn=l yn{yn+l 2/n) 


A = 


Ehl !/n(!/„ - A) 


Similarly, using |^ = 0, we obtain the condition 


1 1 ^ 


yn 


(50) 


(51) 


By combining fl50|) and flSTl) we obtain 

j2 _ X/n=l X]n=l yn{yn+l ~ ^n) ~ X]n=l yni^N+l ~ l/l) 

^ yn{yn+l yn) Xjri=l yn{yN+l 2/l) 


(52) 


Finally, using fll5ll we obtain for the diffusion coefficient the following empir¬ 
ical estimator 


b = a^{\ 



+ 


1 

2NAt 


N 

^ ^ (l/n+l yn T ^{yn 

72=1 


y)Atr. 
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5.2 Estimating an Ornstein-Uhlenbeck process from 
the exact transition probability 

The maximum likelihood method to estimate the parameters of an Ornstein- 
Uhlenbeck process uses the exact transition probability of the observed mo¬ 
tion, given by 

{y- 

2(a2(l + e-2AAt) + 0(1 _ e-2AAi)) 

piXn+i = y\Yn = x) = ^ ■ (53) 

^27r(a2(l + e-2^^‘) + f (1 - e-2^^*)) 

For a trajectory of iV-|- 1 observed points (|/i,..., i/at+i), the log-likelihood is 
%i,---,2/v+i|A,/i,T>) = -^iVlog(cT2(l + + y(l - 

yi+i - y - (yt- 

(l + e-2AAt) + 0(^_g_2AAt))- 



Maximizing the log-likelihood leads for the parameters A, jl, D to the equa¬ 
tions 

di - - 

— (l/i,...,l/jv+i|A,/i,T>) = 0 

d£ ~ - 

— {yi,...,yN+i\X,y,D) = 0 

di - - 

— {yi,...,yN+i\X,y,D) = 0. (54) 

We are left with solving the three equations. However, because of the term 
A and in expression it is not possible to hnd a closed-form solution 

for the parameters. In the following, we will estimate the parameter A using 
numerical optimizations. The estimator for y and the diffusion coefficient D 
are given by 




1 (wU +- 1 


(55) 


D 


A 

1 _ g-2AAt 



N 

'^ivi+i -p-iVi 

i=l 




-XAt\2 



1 — e 


-2XAt 


-2XAt 


.(56) 
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We now compare the two maximum-likelihood estimators determined in sec¬ 
tions 15.11 and 15.21 using numerical simulations. To evaluate the performance 
of the estimators, we simulated trajectories following an Ornstein-Uhlenbeck 
process. We fixed the time-step At = 0.1 and estimated the parameters A,/i 
and D for n = 500 observations. The average and standard deviation of the 
estimated parameters A, fi and D are obtained by taking 500 realizations of 
the process. Moreover, the parameters were estimated for various values of 
the observation noise a. The results are summarized in Fig. [3l As expected, 
the estimator of section 15.21 which is based on the actual transition proba¬ 
bility of the OU process, gives better estimates than the estimator of section 

EH 

6 Discussion and Conclusion 

We presented here several empirical estimators that can be used to compute 
the first and second moment of a stochastic process from SPTs data. When a 
Gaussian noise is added to the physical process, the analysis of the estimator 
reveals that the drift and the diffusion tensor (formula fflSj) and fl20|) i are 
recovered at first order. 

The present estimators are very different from classical MSD, computed 
along trajectories. Here the estimators are based on computing the first and 
second moments using the realization of an ensemble of many trajectories. In 
addition, as shown in appendix B, computing the moments does not require 
the a priori knowledge of the probability distribution function of the process. 
Appendix A shows how the first two moments are computed by dividing the 
space in bins. 

The key message of this analysis is that the drift can be recovered entirely 
to a first order approximation in the variance amplitude a (relation fl67|) ). 
When the drift varies in space, the estimated diffusion tensor contains the 
derivative of the drift (or the divergence in higher dimension), that needs to 
be subtracted to recover the physical diffusion coefficient. 

The present analysis provides the theoretical framework for extracting 
physical parameters from super-resolution single-particle trajectories HI El 
|20] . where the drift was recovered and potential wells were estimated, with¬ 
out accounting for the additive Gaussian external noise. Here we have shown 
that to a first order approximation, the additive Gaussian noise does not 
contribute to the drift estimation (only at the second order), allowing us 
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Figure 3: Comparison of the maximum-likelihood estimators of an 
OU process. Comparison of the estimators in section 15.11 (blue) and 15.21 
(red) for an OU process with H = 1, p = 1, and various values of A. From 
left to right, the plotted estimations are /i — /i, A — A, and D — D. The 
black line is the mean of the estimation for trajectories of 500 points, and 
the colored part indicates ± the standard deviation. The observation time- 
step is At = 0.1 and from top to bottom, a = \/0T (SNR=1), a = \/0.01 
(SNR=10),a = x/(h00T (SNR=100). 
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to conclude that the estimation of the energy of the potential well was not 
affected (to order one) by an external localization noise. This analysis con- 
hrms that the previous results [Ilia [20] are valid even if there is a Gaussian 
empirical noise added. 

Finally, another key result here is that physical potential wells cannot be 
mixed with a fixed particle, as sampled with a noisy measurement. We pro¬ 
vided here a criteria to characterize that a particle is fixed and not confined 
(section |33|). In particular, detected converging arrows in a vector held [HIS] 
extracted from SPT analysis reveals a physical potential well and cannot 
have been generated by hxed particles due to the tracking algorithm. This is 
even more evident when wells are not isotropic. However the present analysis 
does not reveal the origin of the wells. 

7 Appendices 

Appendix A: Approximation formula for the 
local drift and diffusion coefficient 

Computations with the estimators developed here from empirical data de¬ 
pend on the following steps: starting with a sample of Nt observed trajec¬ 
tories i = 1, 2,.. . Nt, j = 1, 2, ..., Ng}, where tj are the sampling 

times, and Ng is the number of points in each trajectory, the dynamics is 
reconstructed by computing the local drift and diffusion coefficient of the ob¬ 
served diffusion process. First, the range of points on the line is partitioned 
into M bins of width r, centered aX Xk, such that 

Xi-^-< min{|/*(tj), l<i<Nt,l<i < Ng] 

and 

xm + ^> max{|/*(tj), I <i < Nt,l < i < Ng]. 

The effective drift and diffusion coefficient of the observed diffusion process 
are evaluated in each bin from the empirical versions of the formulas [HI HH] 

aAt(x) = ~ ^ ^ 

2Dm{x) = liiUAt^o [[y{t + At) - y{t)]‘^ \ y{t) = x] . (58) 
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( 59 ) 


The empirical version of (|S7|) at each bin point Xk is 


^At (^/c) 


J_ y^'i'tj+i) ~ y\tj) 

Nf^ At 

*=1 j=l,y^(tj)GB{x^,,Ax) 


where B{xk,r) is the bin [xk — r/2,Xk + r/2]. The condition y\tj) G B {xk, r) 
in the summation means that \y'‘{tj) — yk\ < 'r/2. The points y'‘{tj) and 
|/*(tj+i) are sampled consecutively from the tth trajectory such that y^itj) G 
B{xk,r) and the number of points in B{xk,r) is N^. Similarly, the empirical 
version of fl^ at bin point Xk is 


DAt{Xk) 


Nk 2At 

i=i j=i,y'-{tj)eB{xk,r) 


(60) 


Appendix B: higher order moment estimates 
and general inversion formula 

We present now a different approach to estimate the drift and diffusion co¬ 
efficients by using direct regular expansion. This approach do not assume 
any knowledge of the pdf of the process and is thus applicable to any general 
manifold. We start with the continuous stochastic equation of ([5]) 



X = a{X) + b{X)w, 

(61) 

and 

Y = X + a'h, 

(62) 

where both w and 77 are two independent identically distributed Brownian 
variables. The close stochastic equation for Y is 


Y = a{Y — arj) -|- b(Y — ar})w -|- arj, 

(63) 

Using 

a Taylor expansion to order k, we get 



0 

(64) 


b(Y a„) - + 

(65) 


0 
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Using a second order expansion we obtain that in dimension one, 


2 2 

Y = a{Y) - aria'{y) + + (&(>") - arjb'iY) + ^r)^b"{Y))w + ar;.(66) 

Thus, the expectation is 


lim 

—^0 


E^,n{Y{t + At)-Y{t)\Y{t) = Y) 
At 


a{Y) + ^Erjirt^)a"iY) + o{a^), 
a{Y) + ^a''{Y) + o{a^), (67) 


where we used that Erjlr]"^) = 1. We conclude that at order two, a correction 
has to be added to the drift, but when a is small this contribution is negligible. 
In particular, this result shows that at the first order the additive noise do 
not influence the recovery of the vector field and local potential wells. The 
energy is thus not affected by this additive noise. 

Similarly, the diffusion coefficient is computed from the second moment 


E„,,,((y(« + A«) - Y{t))^ I Y(t) = y) 
2At 




o{At) + o(cr^ 


( 68 ) 


The analysis presented here can be generalized to n-dimension and does not 
depend on any a priori information about the pdf of the stochastic process 
to be estimated. 
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